Last updated: 2025-10-31

Checks: 6 1

Knit directory: Collaborations/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20210523) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 429ec0b. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    analysis/.Rhistory
    Ignored:    analysis/2022_Mar2_Marinho_cache/

Unstaged changes:
    Modified:   analysis/2025_0601_Shikha.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/2025_0601_Shikha.Rmd) and HTML (docs/2025_0601_Shikha.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 429ec0b han 2025-10-30 10/30/2025
html 429ec0b han 2025-10-30 10/30/2025
Rmd 7590407 han 2025-10-22 10/22/2025
Rmd 548bdea han 2025-09-29 9/29/2025
html 548bdea han 2025-09-29 9/29/2025
Rmd 1c1caeb han 2025-08-28 8/28/2025
html 1c1caeb han 2025-08-28 8/28/2025
Rmd a345225 han 2025-08-08 8/8/2025
html a345225 han 2025-08-08 8/8/2025
Rmd 980dcbf han 2025-08-07 8/7/2025
html 980dcbf han 2025-08-07 8/7/2025
Rmd 3614e56 han 2025-08-07 8/7/2025
html 3614e56 han 2025-08-07 8/7/2025
Rmd 10206de han 2025-08-06 8/6/202
html 10206de han 2025-08-06 8/6/202
Rmd 77c218e han 2025-08-06 8/6/2025
html 77c218e han 2025-08-06 8/6/2025
Rmd 5d069dc han 2025-08-06 8/6/2025
html 5d069dc han 2025-08-06 8/6/2025
Rmd d62c51e han 2025-07-28 7/28/2025
html d62c51e han 2025-07-28 7/28/2025
Rmd 48567d2 han 2025-07-28 7/28/2025
html 48567d2 han 2025-07-28 7/28/2025
Rmd e15351d han 2025-07-28 7/28/2025
html e15351d han 2025-07-28 7/28/2025
Rmd 6d54c64 han 2025-06-05 6/5/2025
html 6d54c64 han 2025-06-05 6/5/2025

data_raw=multiplesheets((file.path(root, "..\\2025\\202506\\Shikha\\Gupta Endo Data.xlsx")))
endo_cases=data_raw$EndoCases
endo_cases=endo_cases %>% mutate(tooth=paste0(endo_cases$`Patient...1`, ":", endo_cases$`Site...4`))

post_endo_tx=data_raw$`Post Endo TX`
post_endo_tx=post_endo_tx %>% mutate(tooth=paste0(post_endo_tx$Patient, ":", post_endo_tx$Site))

#saveRDS(endo_cases, file=file.path(root, "..\\2025\\202506\\Shikha\\endo_cases.RDS"))
#saveRDS(post_endo_tx, file=file.path(root, "..\\2025\\202506\\Shikha\\post_endo_tx.RDS"))

workflow

  1. EndoCases: 45,119 teeth in total, but 30,680 unique teeth —> apply 3 root canal codes, keep the the last root canal date: 30680 teeth.

  2. post Endo Tx: 548,918 teeth, but 358,560 unique ones; among them, 192,958 Endo check yes —> apply final restoration code+endo check yes: 103,341 —> Filter restorations that occurred after or the same day as root canal: 10286 teeth (rows).

step 1: root canal

endo_cases=readRDS(file.path(root, "..\\2025\\202506\\Shikha\\endo_cases.RDS"))
post_endo_tx=readRDS(file.path(root, "..\\2025\\202506\\Shikha\\post_endo_tx.RDS"))

# Create a data frame
root_canal_codes <- data.frame(
  Code = c("D3310", "D3320", "D3330"),
  Description = c(
    "anterior RCT",
    "bicuspid RCT",
    "molar RCT"
  )
)

root_canal_codes%>%
datatable(extensions = 'Buttons',
          caption = "root canal Codes", 
            options = list(dom = 'Blfrtip',
                           buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
                           lengthMenu = list(c(10,25,50,-1),
                                             c(10,25,50,"All"))))
length(unique(endo_cases$tooth))
[1] 30672
endo_cases_with_root_canal=endo_cases %>% filter(`Procedure...5` %in% root_canal_codes$Code)%>%
  group_by(tooth) %>%
  arrange(`Date...7`) %>%
  slice_tail(n = 1) %>%  # Keeps the last root canal date
  ungroup()

endo_cases_with_root_canal$`Date...7`=as.Date(endo_cases_with_root_canal$`Date...7`)
endo_cases_with_root_canal$Birth=as.Date(endo_cases_with_root_canal$Birth)

endo_cases_with_root_canal %>%
datatable(extensions = 'Buttons',
          caption = " patient and tooth with root canal codes", 
            options = list(dom = 'Blfrtip',
                           buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
                           lengthMenu = list(c(10,25,50,-1),
                                             c(10,25,50,"All"))))
  • only keep the latest root canal if the tooth has multiple root canals at different dates.
# Count the number of each Procedure
procedure_counts <- endo_cases_with_root_canal %>%dplyr::count(`Procedure...5`)

# Bar plot
ggplot(procedure_counts, aes(x = `Procedure...5`, y = n, fill=`Procedure...5`)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = n), vjust = -0.5, size = 4) +
  labs(
    title = "",
    x = "Procedure Code",
    y = "Count"
  ) +
  theme_minimal() +
  theme(
    text = element_text(size = 14),
    legend.position = "", 
    plot.title = element_text(hjust = 0.5, face = "bold")
  )

Version Author Date
429ec0b han 2025-10-30
a345225 han 2025-08-08
3614e56 han 2025-08-07
5d069dc han 2025-08-06

step 2: initial treatment

# Create a data frame of Final Restoration codes
final_restoration_codes <- data.frame(
  Code = c(
    "D2390", "D2391", "D2392", "D2393", "D2394",
    "D2650", "D2651", "D2652", "D2662", "D2663", "D2664",
    "D2710", "D2712", "D2720", "D2721", "D2722",
    "D2910", "D2915", "D2920", "D2932", "D2933",
    "D2940", "D2949", "D2950", "D2951", "D2952", "D2954",
    "D6068", "D6069", "D6070", "D6071", "D6072", "D6073", "D6074",
    "D6545", "D6548", "D6611",
    "D6710", "D6720", "D6721", "D6722",
    "D6740", "D6750", "D6780", "D6790", "D6793",
    "D6930", "D6971"
  ),
  Description = c(
    "Resin-based comp crown, ant.",
    "Resin-based comp - 1 surface, posterior",
    "Resin-based comp - 2 surfaces, posterior",
    "Resin-based comp - 3 surfaces, posterior",
    "Resin-based comp - 4+ surfaces, posterior",
    "Inlay - resin - 1 surface",
    "Inlay - resin - 2 surfaces",
    "Inlay - resin - 3 or more",
    "Onlay - resin - 2 surfaces",
    "Onlay - resin - 3 surfaces",
    "Onlay - resin - 4 or more",
    "Resin crown - laboratory",
    "Crown - 3/4 resin-based comp",
    "Resin crown - high noble metal",
    "Resin crown - predominately base metal",
    "Resin crown - noble metal",
    "Recement/re-bond inlay/onlay",
    "Recement cast or prefab post",
    "Recement/re-bond crown",
    "Prefabricated resin crown",
    "Prefab stainless steel crown w/ resin window",
    "Protective restoration",
    "Restorative foundation/indirect restoration",
    "Core buildup - including pins",
    "Pin retention - per tooth",
    "Post & core, indirect fabrication",
    "Prefab post and core",
    "Abutment-retainer, porcelain/ceramic FPD",
    "Abutment-ret., PFM FPD, high noble",
    "Abutment-ret., PFM FPD, base metal",
    "Abutment-ret., PFM FPD, noble metal",
    "Abut-ret., cast metal, high noble",
    "Abut-ret., cast metal, base metal",
    "Abut-ret., cast metal, noble metal",
    "Retainer, metal, resin-bonded FPD",
    "Retainer, porcelain/ceramic, bonded FPD",
    "Retainer onlay, high noble metal, 3+ surfaces",
    "Crown - indirect resin-based",
    "Crown - resin, high noble metal",
    "Crown - resin, predominately base metal",
    "Crown - resin, noble metal",
    "Retainer crown - porcelain/ceramic",
    "Retainer crown - porcelain fused to high noble metal",
    "Retainer crown - 3/4 cast, high noble metal",
    "Retainer crown - full cast, high noble metal",
    "Provisional retainer crown",
    "Recement/re-bond FPD",
    "Cast post - part of FPD retainer"
  )
)

dim(post_endo_tx)
[1] 397633      7
length(unique(post_endo_tx$tooth))
[1] 269430
post_endo_tx_with_final_restoration_codes=post_endo_tx %>% filter(Procedure %in% final_restoration_codes$Code )
#%>% filter(`Endo Check`=="Yes")  this time the data doesn't have endo check column 

final_restoration_codes%>%
datatable(extensions = 'Buttons',
          caption = "restoration codes", 
            options = list(dom = 'Blfrtip',
                           buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
                           lengthMenu = list(c(10,25,50,-1),
                                             c(10,25,50,"All"))))
dim(post_endo_tx_with_final_restoration_codes)
[1] 225363      7

time period between root canal and inital treatment

# Ensure Date columns are Date type
endo_cases_with_root_canal$`Date...7` <- as.Date(endo_cases_with_root_canal$`Date...7`)
post_endo_tx_with_final_restoration_codes$Date <- as.Date(post_endo_tx_with_final_restoration_codes$Date)

# Step 1: Rename columns for clarity
endo_root <- endo_cases_with_root_canal %>%
  rename(
    root_canal_date = `Date...7`,
    root_canal_procedure = `Procedure...5`
  )

restoration <- post_endo_tx_with_final_restoration_codes %>%
  rename(
    restoration_date = Date,
    restoration_procedure = Procedure
  )

# Step 2: Perform inner join on tooth, retaining all necessary columns explicitly
joined <- endo_root %>%
  select(`Patient...1`, tooth, root_canal_date, root_canal_procedure) %>%
  inner_join(restoration %>% select(tooth, restoration_date, restoration_procedure),
             by = "tooth")
dim(joined)
[1] 21340     6
# Step 3: Filter restorations that occurred *after* or the same day as root canal
after_root_canal <- joined %>%
  filter(restoration_date >= root_canal_date)

before_root_canal <- joined %>%
  filter(restoration_date < root_canal_date)

dim(after_root_canal)
[1] 13519     6
# Step 4: Group by root canal event (Patient + tooth + date) and get earliest restoration
first_restoration <- after_root_canal %>%
  group_by(`Patient...1`, tooth, root_canal_date) %>%
  arrange(restoration_date) %>%
  slice(1) %>%
  ungroup()

dim(first_restoration)
[1] 8387    6
# Step 5: Calculate time difference and select output columns
endo_with_tx_time <- first_restoration %>%
  mutate(
    time_between_days = as.numeric(restoration_date - root_canal_date)
  ) %>%
  select(
    `Patient...1`,
    tooth,
    root_canal_date,
    root_canal_procedure,
    first_restoration_date = restoration_date,
    first_restoration_procedure = restoration_procedure,
    time_between_days
  )

endo_with_tx_time %>%
  datatable(
    extensions = 'Buttons',
    options = list(
      dom = 'Blfrtip',
      buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
      lengthMenu = list(c(10, 25, 50, -1), c(10, 25, 50, "All"))
    )
  )
  • if a tooth has multiple restorations after root canal, keep the earliest one only.
ggplot(endo_with_tx_time, aes(x = time_between_days)) +
  geom_histogram(binwidth = 5, fill = "skyblue", color = "black") +
  xlab("Time Between Days") +
  ylab("") +
  ggtitle("") +
  theme_minimal()

Version Author Date
429ec0b han 2025-10-30
1c1caeb han 2025-08-28
a345225 han 2025-08-08
5d069dc han 2025-08-06
summary(endo_with_tx_time$time_between_days)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0     0.0    21.0   141.4    80.0  6476.0 
endo_summary <- endo_with_tx_time %>%
  mutate(time_bin = case_when(
    time_between_days == 0 ~ "0",
    time_between_days >= 1  & time_between_days <= 30   ~ "1-30",
    time_between_days >= 31 & time_between_days <= 60   ~ "31-60",
    time_between_days >= 61 & time_between_days <= 90   ~ "61-90",
    time_between_days >= 91 & time_between_days <= 120  ~ "91-120",
    time_between_days >= 121 & time_between_days <= 180 ~ "121-180",
    time_between_days > 180                            ~ "180+",
    TRUE ~ NA_character_
  )) %>%
  dplyr::count(time_bin, name = "n_tooth") %>%
  arrange(factor(time_bin, levels = c("0","1-30","31-60","61-90","91-120","121-180","180+")))
endo_summary
# A tibble: 7 × 2
  time_bin n_tooth
  <chr>      <int>
1 0           2338
2 1-30        2458
3 31-60       1093
4 61-90        571
5 91-120       356
6 121-180      396
7 180+        1175
library(lubridate)

# Ensure Date columns are Date type
endo_cases_with_root_canal$Date <- as.Date(endo_cases_with_root_canal$Date)
post_endo_tx_with_final_restoration_codes$Date <- as.Date(post_endo_tx_with_final_restoration_codes$Date)

# Rename for clarity
endo <- endo_cases_with_root_canal %>%
  rename(endo_date = Date,
         endo_procedure = Procedure)

resto <- post_endo_tx_with_final_restoration_codes %>%
  rename(restoration_date = Date,
         restoration_procedure = Procedure)

# Perform many-to-many join by "tooth"
combined <- inner_join(endo, resto, by = "tooth") %>%
  # Optional: restrict to matching Patient as well
  filter(Patient.x == Patient.y) %>%
  mutate(time_between_days = as.numeric(restoration_date - endo_date)) %>%
  select(
    Patient = Patient.x,
    tooth,
    endo_date,
    endo_procedure,
    restoration_date,
    restoration_procedure,
    time_between_days
  ) %>%
  arrange(Patient, tooth, endo_date, restoration_date)

# View results

combined %>%
  datatable(
    extensions = 'Buttons',
    caption = "",
    options = list(
      dom = 'Blfrtip',
      buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
      lengthMenu = list(c(10, 25, 50, -1), c(10, 25, 50, "All"))
    )
  )

step 3: post treatment

# Create a data frame
failure_codes <- data.frame(
  Code = c("D3346", "D3347", "D3348", "D3421", "D3425", "D3410", "D3426",
           "D3430", "D3450", "D3470", "D3505", "D3920", "D7140", "D7210"),
  Description = c(
    "Anterior-retreat prev. rt. canal",
    "Bicuspid retreat prev. rt. canal",
    "Molar retreat prev. rt. canal",
    "Apicoectomy - bicuspid (1st root)",
    "Apicoectomy - molar (1st root)",
    "Apicoectomy - anterior",
    "Apicoectomy - additional roots",
    "Retrograde filling - per root",
    "Root amputation - per root",
    "Intentional reimplantation",
    "Surgical exposure of root surface w/o apicoectomy",
    "Hemisection, incl. root removal",
    "Extraction",
    "Surgical removal of erupted tooth"
  )
)


failure_codes%>%
datatable(extensions = 'Buttons',
          caption = "Extraction /failure codes", 
            options = list(dom = 'Blfrtip',
                           buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
                           lengthMenu = list(c(10,25,50,-1),
                                             c(10,25,50,"All"))))
# Step 1: Define the failure-related procedure codes
failure_codes <- c(
  "D3346", "D3347", "D3348", "D3421", "D3425",
  "D3410", "D3426", "D3430", "D3450", "D3470",
  "D3505", "D3920", "D7140", "D7210"
)

# Step 2: Extract only failure events from all procedures
# all_procedures must include: Patient, tooth, Date, Procedure
failure_events <- post_endo_tx %>% 
  #filter(`Endo Check`=="Yes") %>% 
  filter(Procedure %in% failure_codes) %>%
  rename(failure_date = Date, failure_code = Procedure) %>%
  select(Patient, tooth, failure_date, failure_code)

failure_events$failure_date=as.Date(failure_events$failure_date)

# Step 3: Join with endo_with_tx_time and filter for valid failures (after restoration); this step will filter some patients with failure date prior to restoration  
colnames(endo_with_tx_time)[1]="Patient"
endo_with_failure <- endo_with_tx_time %>%
  left_join(failure_events, by = c("Patient", "tooth")) %>%
  filter(is.na(failure_date) | failure_date >= first_restoration_date) %>%  # allow NA for no failure
  group_by(Patient, tooth, root_canal_date) %>%
  arrange(failure_date) %>%
  slice(1) %>%  # earliest failure (or NA)
  ungroup() %>%
  mutate(
    survival_days = as.numeric(failure_date - first_restoration_date),
    survival_weeks = round(survival_days / 7,2), 
    survival_months = round(survival_days / 30.44,2), 
    survival_years = round(survival_days / 365.25, 2)
  )

endo_with_failure%>%
datatable(extensions = 'Buttons',
          caption = "", 
            options = list(dom = 'Blfrtip',
                           buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
                           lengthMenu = list(c(10,25,50,-1),
                                             c(10,25,50,"All"))))

survival analysis

# Load required libraries
library(survival)
library(survminer)

# Find the latest date in the dataset (from either failure or restoration dates)
last_followup_date <- max(
  c(endo_with_failure$failure_date, endo_with_failure$first_restoration_date),
  na.rm = TRUE
)

# Fill in survival_days for censored teeth
survival_data <- endo_with_failure %>%
  mutate(
    event = ifelse(is.na(failure_date), 0, 1),
    end_date = ifelse(event == 1, failure_date, last_followup_date),
    end_date = as.Date(end_date, origin = "1970-01-01"), # ensure Date type
    survival_days = as.numeric(end_date - first_restoration_date)
  )

# Step 2: Create a Surv object
surv_object <- Surv(time = survival_data$survival_days, event = survival_data$event)

# Step 3: Fit Kaplan-Meier survival model
km_fit <- survfit(surv_object ~ 1)

# Step 4: Plot Kaplan-Meier survival curve
ggsurvplot(
  km_fit,
  data = survival_data,  # ✅ This solves the error
  conf.int = TRUE,
  xlab = "Survival Days after restoration",
  ylab = "Tooth survival probability",
  title = "Kaplan-Meier Survival Curve of Root Canal Treated Teeth",
  surv.median.line = "hv",
  risk.table = TRUE,
  risk.table.height = 0.25
)

Version Author Date
a345225 han 2025-08-08
  • if a tooth never experiences failure/extraction, the restoration works to the end of the study
# Step 1. Create event, survival_days, and time_bin
survival_data <- endo_with_failure %>%
  mutate(
    event = if_else(is.na(failure_date), 0L, 1L),  # 1 = failure, 0 = censored
    end_date = if_else(event == 1L, failure_date, last_followup_date),
    survival_days = as.numeric(end_date - first_restoration_date),
    time_bin = case_when(
      time_between_days == 0 ~ "0",
      time_between_days >= 1  & time_between_days <= 30   ~ "1-30",
      time_between_days >= 31 & time_between_days <= 60   ~ "31-60",
      time_between_days >= 61 & time_between_days <= 90   ~ "61-90",
      time_between_days >= 91 & time_between_days <= 120  ~ "91-120",
      time_between_days >= 121 & time_between_days <= 180 ~ "121-180",
      time_between_days > 180                             ~ "180+",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(time_bin), !is.na(survival_days))  # keep valid cases

survival_data <- survival_data %>%
  filter(!is.na(time_bin), !is.na(survival_days)) %>%
  mutate(time_bin = factor(
    time_bin,
    levels = c("0","1-30","31-60","61-90","91-120","121-180","180+"),
    ordered = TRUE
  ))


# Step 2. Survival object and fit by groups
surv_obj <- Surv(time = survival_data$survival_days, event = survival_data$event)
fit <- survfit(surv_obj ~ time_bin, data = survival_data)

# Step 3. Plot KM curves


ggsurvplot(
  fit,
  data = survival_data,
  pval = TRUE,
  risk.table = TRUE,
  risk.table.height=0.4,
  conf.int = FALSE,          # remove confidence band
  legend.title = NULL,
  legend.labs = gsub("time_bin=", "", names(fit$strata)),
  risk.table.fontsize = 3,
  xlab = "Survival days after restoration",
  ylab = "Survival probability",
  linetype = 1,              # solid lines
  size = 1,                  # line width
  surv.median.line = "none", # remove median line
  #censor.shape = "|",        # show censoring
  censor.size = 2,             # small censor marks
  censor.shape = NA,       # <-- hide the vertical censoring ticks
  ggtheme = theme_minimal(base_size = 12) +   # base font size
    theme(legend.text = element_text(size = 14))  # increase legend label size
)

Version Author Date
429ec0b han 2025-10-30
548bdea han 2025-09-29
1c1caeb han 2025-08-28
a345225 han 2025-08-08
5d069dc han 2025-08-06
ggsurvplot(
  fit,
  data = survival_data,
  pval = TRUE,
  risk.table = TRUE,
  risk.table.height=0.4,
  conf.int = T,          # remove confidence band
  legend.title = NULL,
  legend.labs = gsub("time_bin=", "", names(fit$strata)),
  risk.table.fontsize = 3,
  xlab = "Survival days after restoration",
  ylab = "Survival probability",
  linetype = 1,              # solid lines
  size = 1,                  # line width
  surv.median.line = "none", # remove median line
  #censor.shape = "|",        # show censoring
  censor.size = 2,             # small censor marks
  censor.shape = NA,        # <-- hide the vertical censoring ticks
  ggtheme = theme_minimal(base_size = 12) +   # base font size
    theme(legend.text = element_text(size = 14))  # increase legend label size
)

# Step 1. Create event, survival_days, and time_bin
survival_data <- endo_with_failure %>%
  mutate(
    event = if_else(is.na(failure_date), 0L, 1L),  # 1 = failure, 0 = censored
    end_date = if_else(event == 1L, failure_date, last_followup_date),
    survival_days = as.numeric(end_date - first_restoration_date),
    time_bin = case_when(
     time_between_days <= 60   ~ "0-60",
      time_between_days >= 61 & time_between_days <= 120   ~ "61-120",
      time_between_days >= 121  ~ "120+",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(time_bin), !is.na(survival_days))  # keep valid cases

survival_data <- survival_data %>%
  filter(!is.na(time_bin), !is.na(survival_days)) %>%
  mutate(time_bin = factor(
    time_bin,
    levels = c("0-60","61-120","120+"),
    ordered = TRUE
  ))


# Step 2. Survival object and fit by groups
surv_obj <- Surv(time = survival_data$survival_days, event = survival_data$event)
fit <- survfit(surv_obj ~ time_bin, data = survival_data)

# Step 3. Plot KM curves


ggsurvplot(
  fit,
  data = survival_data,
  pval = TRUE,
  risk.table = TRUE,
  risk.table.height=0.4,
  conf.int = FALSE,          # remove confidence band
  legend.title = NULL,
  legend.labs = gsub("time_bin=", "", names(fit$strata)),
  risk.table.fontsize = 3,
  xlab = "Survival days after restoration",
  ylab = "Survival probability",
  linetype = 1,              # solid lines
  size = 1,                  # line width
  surv.median.line = "none", # remove median line
  #censor.shape = "|",        # show censoring
  censor.size = 2,             # small censor marks
  censor.shape = NA,       # <-- hide the vertical censoring ticks
  ggtheme = theme_minimal(base_size = 12) +   # base font size
    theme(legend.text = element_text(size = 14))  # increase legend label size
)

Version Author Date
429ec0b han 2025-10-30
548bdea han 2025-09-29
1c1caeb han 2025-08-28
3614e56 han 2025-08-07
ggsurvplot(
  fit,
  data = survival_data,
  pval = TRUE,
  risk.table = TRUE,
  risk.table.height=0.4,
  conf.int = T,          # remove confidence band
  legend.title = NULL,
  legend.labs = gsub("time_bin=", "", names(fit$strata)),
  risk.table.fontsize = 3,
  xlab = "Survival days after restoration",
  ylab = "Survival probability",
  linetype = 1,              # solid lines
  size = 1,                  # line width
  surv.median.line = "none", # remove median line
  #censor.shape = "|",        # show censoring
  censor.size = 2,             # small censor marks
  censor.shape = NA,        # <-- hide the vertical censoring ticks
  ggtheme = theme_minimal(base_size = 12) +   # base font size
    theme(legend.text = element_text(size = 14))  # increase legend label size
)

Version Author Date
429ec0b han 2025-10-30
548bdea han 2025-09-29
1c1caeb han 2025-08-28

sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 26100)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] survminer_0.5.0     survival_3.8-3      VennDiagram_1.7.3  
 [4] futile.logger_1.4.3 condsurv_1.0.0      devtools_2.4.5     
 [7] usethis_3.1.0       tidycmprsk_1.1.0    gtsummary_2.0.4    
[10] ggsurvfit_1.1.0     irr_0.84.1          lpSolve_5.6.23     
[13] readxl_1.4.3        cowplot_1.1.3       matrixStats_1.5.0  
[16] gridExtra_2.3       DT_0.33             rstatix_0.7.2      
[19] ggpubr_0.6.0        kableExtra_1.4.0    lubridate_1.9.4    
[22] forcats_1.0.0       stringr_1.5.1       dplyr_1.1.4        
[25] purrr_1.0.2         readr_2.1.4         tidyr_1.3.1        
[28] tibble_3.2.1        ggplot2_3.5.1       tidyverse_2.0.0    
[31] rprojroot_2.0.4    

loaded via a namespace (and not attached):
 [1] formatR_1.14         remotes_2.5.0        rlang_1.1.2         
 [4] magrittr_2.0.3       git2r_0.35.0         compiler_4.3.2      
 [7] systemfonts_1.2.1    vctrs_0.6.5          profvis_0.4.0       
[10] pkgconfig_2.0.3      fastmap_1.2.0        backports_1.5.0     
[13] ellipsis_0.3.2       labeling_0.4.3       KMsurv_0.1-5        
[16] utf8_1.2.4           promises_1.3.2       rmarkdown_2.29      
[19] markdown_1.13        sessioninfo_1.2.2    tzdb_0.4.0          
[22] xfun_0.50.6          cachem_1.1.0         jsonlite_1.8.9      
[25] later_1.4.1          broom_1.0.7          R6_2.6.1            
[28] bslib_0.9.0          stringi_1.8.3        car_3.1-3           
[31] pkgload_1.4.0        jquerylib_0.1.4      cellranger_1.1.0    
[34] Rcpp_1.0.11          knitr_1.49           zoo_1.8-14          
[37] httpuv_1.6.15        Matrix_1.6-1.1       splines_4.3.2       
[40] timechange_0.3.0     tidyselect_1.2.1     rstudioapi_0.17.1   
[43] abind_1.4-8          yaml_2.3.8           ggtext_0.1.2        
[46] miniUI_0.1.1.1       pkgbuild_1.4.6       lattice_0.21-9      
[49] shiny_1.10.0         withr_3.0.2          evaluate_1.0.3      
[52] lambda.r_1.2.4       urlchecker_1.0.1     xml2_1.3.6          
[55] survMisc_0.5.6       pillar_1.10.1        carData_3.0-5       
[58] whisker_0.4.1        generics_0.1.3       hms_1.1.3           
[61] commonmark_1.9.2     munsell_0.5.1        scales_1.3.0        
[64] xtable_1.8-4         glue_1.8.0           tools_4.3.2         
[67] data.table_1.16.4    ggsignif_0.6.4       fs_1.6.5            
[70] crosstalk_1.2.1      colorspace_2.1-0     Formula_1.2-5       
[73] cli_3.6.2            km.ci_0.5-6          workflowr_1.7.1     
[76] futile.options_1.0.1 viridisLite_0.4.2    svglite_2.1.3       
[79] gtable_0.3.6         sass_0.4.9           digest_0.6.33       
[82] farver_2.1.2         htmlwidgets_1.6.4    memoise_2.0.1       
[85] htmltools_0.5.8.1    lifecycle_1.0.4      mime_0.12           
[88] gridtext_0.1.5